Refactor workflow to DAG architecture with expanded testing, CI modernisation, and validated numerical equivalence#281

Merged
harryswift01 merged 115 commits into main from 173-refactor-levels
Feb 26, 2026
Conversation

@harryswift01
Member

@harryswift01 harryswift01 commented Feb 25, 2026

Summary

This PR introduces a major architectural refactor that transitions CodeEntropy to a DAG-based execution model while preserving numerical behaviour. The refactor improves modularity, maintainability, and testability, and modernises the project’s testing, CI, and tooling infrastructure.

All systems and component-level contributions match the previous implementation within floating-point tolerance (maximum absolute difference 2.45e-08).

No breaking changes to user-facing behaviour are expected.

Motivation

The previous workflow was primarily procedural, which made it harder to reason about execution order, extend functionality, and test individual stages. Moving to a DAG model improves separation of concerns, enables clearer data flow, and provides a stronger foundation for future development while maintaining numerical equivalence.

Changes

DAG-based workflow architecture

  • Refines orchestration by separating static setup, per-frame execution, and reducers.
  • Decomposes workflow into smaller nodes (detection, bead construction, covariance, reducers).
  • Standardises shared context passing between nodes.
  • Implements incremental reduction (streaming mean) for covariance accumulation.
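The incremental reduction can be sketched as a running-mean reducer that folds each per-frame matrix into a single accumulator, so no frame needs to be kept in memory. This is an illustrative sketch under assumed names; the actual reducer node's API may differ.

```python
import numpy as np


class StreamingMean:
    """Incrementally accumulate the mean of per-frame matrices
    without storing every frame (hypothetical sketch of the
    streaming-mean reducer described above)."""

    def __init__(self):
        self.count = 0
        self.mean = None

    def update(self, frame_matrix):
        self.count += 1
        if self.mean is None:
            self.mean = np.array(frame_matrix, dtype=float)
        else:
            # Incremental update: mean += (x - mean) / n
            self.mean += (frame_matrix - self.mean) / self.count

    def result(self):
        return self.mean


# Usage: averaging per-frame second-moment matrices
frames = [np.outer(v, v) for v in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])]
reducer = StreamingMean()
for m in frames:
    reducer.update(m)
print(np.allclose(reducer.result(), np.mean(frames, axis=0)))  # True
```

The incremental form is algebraically identical to the batch mean, which is why this change can preserve numerical behaviour to floating-point tolerance.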

Frame-level covariance computation

  • Introduces FrameCovarianceNode for per-frame second-moment matrices.
  • Adds optional combined force-torque block matrix generation at the highest level.
  • Standardises axis handling via axes_manager.
  • Improves robustness for missing beads and metadata.
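A minimal sketch of the per-frame second-moment computation, including the optional combined force-torque block. Function and argument names here are assumptions for illustration, not the actual `FrameCovarianceNode` interface.

```python
import numpy as np


def frame_second_moment(forces, torques, combined=False):
    """Per-frame second-moment matrices (illustrative sketch).

    forces, torques: (n_beads, 3) arrays for a single frame.
    combined=True stacks force and torque per bead into a 6-vector
    and builds a 6x6 block matrix, mirroring the optional combined
    force-torque block described above.
    """
    if combined:
        ft = np.hstack([forces, torques])        # (n_beads, 6)
        return np.einsum("bi,bj->bij", ft, ft)   # (n_beads, 6, 6)
    # Separate 3x3 force and torque second moments per bead.
    f_block = np.einsum("bi,bj->bij", forces, forces)
    t_block = np.einsum("bi,bj->bij", torques, torques)
    return f_block, t_block
```

Feeding these per-frame matrices into a streaming-mean reducer yields the time-averaged covariance blocks.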

CLI and job-folder execution model

  • Ensures consistent job folder creation for each run.
  • Guarantees output artifacts land in job directories.
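The job-folder model can be sketched as follows: each run creates the next sequential directory (`job001`, `job002`, ...) under a base path, and all artifacts are written inside it. Names and the numbering width are assumptions for illustration.

```python
from pathlib import Path


def next_job_folder(base_dir, prefix="job", width=3):
    """Create the next sequential job folder, e.g. job001, job002, ...
    Illustrative sketch of the job-folder model; exact naming is assumed."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    numbers = []
    for entry in base.iterdir():
        suffix = entry.name[len(prefix):]
        if entry.is_dir() and entry.name.startswith(prefix) and suffix.isdigit():
            numbers.append(int(suffix))
    job = base / f"{prefix}{max(numbers, default=0) + 1:0{width}d}"
    job.mkdir()  # fails loudly if a race created it first
    return job
```

Because every run gets its own directory, repeated runs never overwrite each other's `output_file.json`.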

ResultsReporter improvements

  • Reworks JSON output into grouped structure:
    groups: { "<group_id>": { components: {...}, total: ... } }
  • Groups console tables by Group ID.
  • Adds metadata sections (args, provenance).
  • Adds utilities for argument serialization and git SHA detection.

Example output JSON:

{
  "args": {
    "top_traj_file": [
      "/home/tdo96567/BioSim/test_data/dna/md_A4_dna.tpr",
      "/home/tdo96567/BioSim/test_data/dna/md_A4_dna_xf.trr"
    ],
    "force_file": null,
    "file_format": null,
    "kcal_force_units": false,
    "selection_string": "all",
    "start": 0,
    "end": 1,
    "step": 1,
    "bin_width": 30,
    "temperature": 298.0,
    "verbose": false,
    "output_file": "/home/tdo96567/BioSim/temp/refactor/1-frame/job001/output_file.json",
    "force_partitioning": 0.5,
    "water_entropy": true,
    "grouping": "molecules",
    "combined_forcetorque": true,
    "customised_axes": true
  },
  "provenance": {
    "python": "3.14.0",
    "platform": "Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39",
    "codeentropy_version": "1.0.7",
    "git_sha": "cb22762349b99c149f13392d9280acb4dffec976"
  },
  "groups": {
    "0": {
      "components": {
        "united_atom:Transvibrational": 0.0,
        "united_atom:Rovibrational": 0.002160679012128457,
        "residue:Transvibrational": 0.0,
        "residue:Rovibrational": 3.376800684085249,
        "polymer:FTmat-Transvibrational": 12.341104347192612,
        "polymer:FTmat-Rovibrational": 0.0,
        "united_atom:Conformational": 7.269386795471401,
        "residue:Conformational": 0.0
      },
      "total": 22.989452505761392
    },
    "1": {
      "components": {
        "united_atom:Transvibrational": 0.0,
        "united_atom:Rovibrational": 0.01846427765949586,
        "residue:Transvibrational": 0.0,
        "residue:Rovibrational": 2.3863201082544565,
        "polymer:FTmat-Transvibrational": 11.11037253388596,
        "polymer:FTmat-Rovibrational": 0.0,
        "united_atom:Conformational": 6.410455987098191,
        "residue:Conformational": 0.46183561256411515
      },
      "total": 20.387448519462218
    }
  }
}
Testing architecture overhaul

  • Expands unit test coverage across previously untested branches.
  • Adds regression test harness running CLI in isolated temp directories.
  • Ensures deterministic job folder creation during tests.
  • Adds baseline comparison against stored JSON outputs.
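The baseline comparison can be sketched as a recursive walk over the output JSON that compares floats with a tolerance and everything else exactly. This is a minimal sketch; the tolerance values and helper name are assumptions, not the project's actual harness.

```python
import math


def compare_to_baseline(result, baseline, atol=1e-7):
    """Recursively assert that a results structure matches a stored
    baseline, allowing an absolute tolerance on floats (sketch of the
    baseline comparison described above)."""
    if isinstance(baseline, dict):
        # Keys must match exactly; values are compared recursively.
        assert set(result) == set(baseline)
        for key in baseline:
            compare_to_baseline(result[key], baseline[key], atol)
    elif isinstance(baseline, float):
        assert math.isclose(result, baseline, rel_tol=0.0, abs_tol=atol)
    else:
        assert result == baseline
```

Comparing component-by-component, rather than only group totals, is what lets the harness catch a regression in one contribution that happens to cancel in the sum.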

Regression dataset system

  • Automatic dataset download from CCPBioSim HTTPS filestore.
  • Local caching in .testdata/.
  • Intelligent detection of required files.
  • No manual setup required.
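The download-with-cache behaviour can be sketched like this: fetch a file only when it is not already in the local `.testdata/` cache. The base URL and filestore layout here are placeholders, not the actual CCPBioSim endpoint.

```python
import urllib.request
from pathlib import Path


def fetch_dataset(filename, base_url, cache_dir=".testdata"):
    """Download a regression dataset once and reuse the cached copy.
    Sketch only: base_url and the filestore layout are assumptions."""
    cache = Path(cache_dir)
    cache.mkdir(exist_ok=True)
    target = cache / filename
    if not target.exists():
        # Cache miss: pull the file from the filestore.
        urllib.request.urlretrieve(f"{base_url}/{filename}", target)
    return target
```

On subsequent runs (and with CI dataset caching), the existence check short-circuits and no network access happens at all.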

Quick vs slow regression separation

  • Introduces slow marker for long-running systems.
  • Quick regression suite excludes slow tests for fast feedback.
  • Full regression suite runs in weekly workflows.
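The quick/slow split maps naturally onto a pytest marker; a sketch of how a long-running system test might be tagged (the test name is hypothetical, and marker registration normally lives in `pyproject.toml`):

```python
import pytest


@pytest.mark.slow
def test_large_system_regression():
    """Placeholder for a long-running full-trajectory comparison
    against stored baselines (hypothetical test)."""
    ...


# Quick feedback on PRs:   pytest -m "not slow"
# Weekly full regression:  pytest
```

Excluding `slow` on PRs keeps feedback fast while the weekly workflow still exercises every system.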

Developer and tooling improvements

  • Standardises pytest markers and commands.
  • Adds --update-baselines workflow.
  • Improves debug diagnostics and artifact capture.
  • Migrates linting and formatting to Ruff.
  • Removes Black, Flake8, and isort.
  • Updates pre-commit configuration.
  • Simplifies optional dependencies.

CI/CD modernisation

  • Multi-OS testing (Linux, macOS, Windows).
  • Python matrix (3.12–3.14).
  • Quick regression tests on PRs.
  • Weekly full regression workflow.
  • Weekly docs build across Python versions.
  • Daily validation workflow.
  • Artifact upload on failures.
  • Dataset caching in CI.

Documentation pipeline

  • Docs build validation on PRs.
  • Weekly docs compatibility checks.
  • Updates developer guide reflecting new workflows.

Logging and error handling improvements

  • Eliminates duplicate traceback logging.
  • Centralises error boundary in CLI.
  • Improves exception chaining.
  • Adds argument logging on runtime failures.
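The single-error-boundary pattern can be sketched as: the runtime wraps failures with `raise ... from`, preserving the original chain, while only the CLI logs the traceback, so it appears exactly once. Class and function names here are illustrative assumptions.

```python
import logging

logger = logging.getLogger("codeentropy")


class RuntimeFailure(RuntimeError):
    """Wrapped error raised by the runtime (hypothetical name)."""


def run_workflow(args):
    try:
        # ... DAG execution would happen here ...
        raise ValueError("bad bin width")  # stand-in for a real failure
    except Exception as exc:
        # Chain the original exception; do NOT log here (avoids
        # duplicate tracebacks).
        raise RuntimeFailure("workflow failed") from exc


def main(args=None):
    """Single error boundary: the CLI logs the traceback exactly once,
    including the arguments that triggered the failure."""
    try:
        run_workflow(args)
    except RuntimeFailure:
        logger.exception("Run failed with arguments: %r", args)
        return 1
    return 0
```

Because `__cause__` is preserved, the logged traceback still shows the original `ValueError` beneath the wrapped error.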

Impact

  • Improves maintainability through DAG decomposition.
  • Increases confidence via expanded unit and regression coverage.
  • Provides faster CI feedback with quick regression separation.
  • Improves reproducibility with provenance metadata.
  • Simplifies developer setup with automatic datasets.
  • Modernises tooling stack with faster linting.
  • Improves cross-platform reliability via expanded CI matrices.
  • Provides clearer debugging through improved logging and artifacts.

Regression validation results

CodeEntropy Graph Implementation.xlsx

Entropy outputs from the refactored DAG workflow were compared against the previous implementation across all systems and component types. All values agree within floating-point tolerance.

Maximum absolute difference across all systems: 2.45e-08

This comparison confirms agreement across all individual contributions, not only group totals. Observed differences are consistent with expected floating-point variation introduced by consolidating numerical operations into NumPy.
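The comparison above amounts to taking the maximum absolute difference over all matched component values; a minimal sketch with toy data (the 2.45e-08 figure comes from the full spreadsheet comparison, not from this example):

```python
import numpy as np


def max_abs_difference(old, new):
    """Maximum absolute difference between matched component values
    from the previous and refactored runs (illustrative check)."""
    keys = sorted(set(old) & set(new))
    a = np.array([old[k] for k in keys])
    b = np.array([new[k] for k in keys])
    return float(np.max(np.abs(a - b)))


# Toy data standing in for one group's component dictionaries.
old = {"united_atom:Rovibrational": 0.5, "residue:Rovibrational": 3.3768}
new = {"united_atom:Rovibrational": 0.5, "residue:Rovibrational": 3.3768 + 1e-8}
print(max_abs_difference(old, new))  # ~1e-08
```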

- Renamed `CodeEntropy/levels/structual_analysis.py` -> `CodeEntropy/levels/dihedral_analysis.py` to more accurately describe what this class does
- Introduced a new class within the module `CodeEntropy/levels/neighbours.py`
- `VibrationalEntropy` class -> own dedicated file within `CodeEntropy/entropy/nodes/vibrational_entropy.py`
- `ConformationalEntropy` class -> own dedicated file within `CodeEntropy/entropy/nodes/configurational_entropy.py`
- `OrientationalEntropy` class -> own dedicated file within `CodeEntropy/entropy/nodes/orientational_entropy.py`
- Created a placeholder for the new graph builder `CodeEntropy/entropy/entropy_graph.py`
- Arguments are added to the `output_file.json`
- Provenance added to the `output_file.json`
- Tidied output logging in the `output_file.json`
Implement regression framework with:
- baseline JSON comparisons
- automatic dataset download from filestore
- .testdata cache
- slow test markers
- config-driven system tests
- CI workflows for quick PR checks and weekly full regression
This provides reproducible validation of scientific results across releases.
…ession:

Run unit tests across all supported OS and Python versions, add quick regression
suite to PRs with .testdata caching, and configure weekly workflow to run full
regression including slow tests. Simplify docs builds to latest environment.
- Add badges for PR checks, daily tests, weekly regression, and weekly docs.
- Remove obsolete workflow badges and align README with current CI setup.
- Replace black, flake8, and isort with Ruff for linting and formatting.
- Update pre-commit configuration and dependencies, add Ruff config to
pyproject.toml, and apply automatic fixes across the codebase.
…or handling:

- Avoid double logging of exceptions by centralising traceback reporting in the CLI.
- Runtime now raises clean errors while preserving original exception chaining.
- Add ResultsReporter progress context manager
- Propagate optional progress sink through workflow orchestration
- Add progress reporting for:
  - Conformational state construction (per group)
  - Frame processing stage (per frame)
- Keep entropy graph execution silent due to fast runtime
- Update runtime tests to reflect wrapped RuntimeError behaviour
…tooling:

- Add instructions for unit vs regression test suites
- Document slow test markers and how to run them
- Explain automatic regression dataset downloads via filestore
- Add guidance for updating regression baselines
- Update coding standards to use Ruff instead of Black/Flake8/isort
- Document multi-OS and multi-Python CI workflows
- Clarify developer setup and testing commands
- Remove outdated tooling references
@harryswift01 harryswift01 requested a review from jimboid February 25, 2026 17:10
Member

@jimboid jimboid left a comment


This represents a significant milestone in the restructuring of the code base into a modern DAG-based architecture. It brings significant simplifications to how the code is structured and eliminates the multi-nested, tangled concerns. This will serve as a launchpad for a much more useful API and task-based parallelism, while being truly extensible into the future. Well done with this; it is a significant step forward in managing the technical debt in this project and puts it on a solid footing for future development and sustainability.

One suggestion: if the code coverage step is still failing, let it continue on failure. That way CI will still pass for testing purposes and will show the cached coverage from main while it is broken.

@harryswift01 harryswift01 merged commit 41fc6bd into main Feb 26, 2026
13 checks passed
@harryswift01 harryswift01 deleted the 173-refactor-levels branch February 26, 2026 11:31

Labels

feature request New feature or request

2 participants